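Deep segmentation models often risk failure when test images come from unseen distributions. Improving model robustness against these risks is crucial for the large-scale clinical application of deep models. In this study, inspired by the human learning cycle, we propose a novel online reflective learning framework (RefSeg) to improve segmentation robustness. Based on the concept of reflection-on-action, RefSeg first drives the deep model to take action and obtain a semantic segmentation. RefSeg then triggers the model to reflect on itself. Because it is challenging to make deep models aware of their own segmentation failures during testing, RefSeg synthesizes a realistic proxy image from the semantic mask to help the model build intuitive and effective reflections. This proxy translates and emphasizes the segmentation flaws. By maximizing the structural similarity between the raw input and the proxy, the reflection-on-action loop is closed and segmentation robustness improves. RefSeg runs in the testing phase and is general across segmentation models. Extensive validation on three medical image segmentation tasks, using a public cardiac MR dataset and two large in-house ultrasound datasets, shows that RefSeg remarkably improves model robustness and reports state-of-the-art performance against strong competitors.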
translated by 谷歌翻译
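The RefSeg abstract above states only that structural similarity between the raw input and the proxy image is maximized; it does not say which measure is used. As a rough illustration, the sketch below computes a single-window (global) variant of the standard SSIM index over two flattened images. The function name `global_ssim`, the flat-list image representation, and the choice of the global variant are all assumptions made for this example, not details from the paper.

```python
def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM between two equally sized, flattened images.

    Assumption: pixel values lie in [0, data_range]. This is the global
    variant of SSIM, not the sliding-window version used in most toolkits.
    """
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


raw_input = [0.1, 0.5, 0.9, 0.3]   # hypothetical flattened test image
good_proxy = [0.1, 0.5, 0.9, 0.3]  # proxy synthesized from an accurate mask
poor_proxy = [0.9, 0.5, 0.1, 0.7]  # proxy synthesized from a flawed mask
print(global_ssim(raw_input, good_proxy))
print(global_ssim(raw_input, poor_proxy))
```

In a test-time loop of the kind the abstract describes, a score like this would act as the signal to increase: a proxy that resembles the input less indicates a mask with more flaws.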
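Federated learning (FL) enables training a global model without sharing the decentralized raw data stored on multiple devices, thereby protecting data privacy. Because devices differ in capacity, FL frameworks struggle with straggler effects and outdated models. In addition, data heterogeneity causes severe accuracy degradation of the global model during FL training. To address these issues, we propose a hierarchical synchronous FL framework, FedHiSyn. FedHiSyn first clusters all available devices into a small number of categories based on their computing capacity. After a certain interval of local training, the models trained in the different categories are uploaded to a central server simultaneously. Within a single category, devices pass their locally updated model weights to one another along a ring topology. Because training in a ring topology is most efficient when devices have homogeneous resources, classifying devices by computing capacity mitigates the impact of straggler effects. Moreover, combining synchronous updates across categories with device-to-device communication within each category helps address data heterogeneity while achieving high accuracy. We evaluate the proposed framework on the MNIST, EMNIST, CIFAR10 and CIFAR100 datasets under diverse heterogeneous device settings. Experimental results show that FedHiSyn outperforms six baseline methods, e.g., FedAvg, SCAFFOLD and FedAT, in terms of training accuracy and efficiency.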
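The round structure described in this abstract (group devices by computing capacity, pass a model around a ring inside each group, then combine the group models on the server) can be sketched as follows. This is a toy illustration under stated assumptions, not the FedHiSyn implementation: the equal-size grouping, the one-line `local_update` step, the plain averaging on the server, and every function name are hypothetical choices made for the example.

```python
from statistics import fmean


def cluster_by_capacity(capacities, num_categories):
    """Group device ids so that devices with similar capacity share a group.

    Assumption: sort by capacity and cut into equal-size chunks. This yields
    at most num_categories groups; the paper may cluster differently.
    """
    ordered = sorted(capacities, key=capacities.get)
    size = -(-len(ordered) // num_categories)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def local_update(weights, data, lr=0.1):
    """Toy local training step: nudge each weight toward the data mean."""
    target = fmean(data)
    return [w + lr * (target - w) for w in weights]


def ring_round(weights, ring, datasets):
    """Pass the model once around the ring; each device trains, then forwards."""
    for device in ring:
        weights = local_update(weights, datasets[device])
    return weights


def server_aggregate(models):
    """Element-wise average of the category models (FedAvg-style assumption)."""
    return [fmean(values) for values in zip(*models)]


def training_round(global_weights, capacities, datasets, num_categories):
    """One synchronous round: ring training per category, then aggregation."""
    categories = cluster_by_capacity(capacities, num_categories)
    models = [ring_round(list(global_weights), ring, datasets)
              for ring in categories]
    return server_aggregate(models)


# Hypothetical devices: two slow ("a", "c") and two fast ("b", "d").
capacities = {"a": 1.0, "b": 9.0, "c": 1.2, "d": 8.5}
datasets = {name: [1.0] for name in capacities}
print(cluster_by_capacity(capacities, 2))
print(training_round([0.0], capacities, datasets, 2))
```

Grouping by capacity keeps each ring roughly homogeneous, so no device in a ring waits long for a much slower neighbor, which is the intuition the abstract gives for mitigating stragglers.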
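Graph outlier detection is an emerging but crucial machine learning task with numerous applications. Despite the proliferation of algorithms in recent years, the lack of a standard, unified setting for performance evaluation limits their advancement and use in real-world applications. To close this gap, we present what is, to our knowledge, the first comprehensive benchmark for unsupervised node outlier detection on graphs, called UNOD, with the following highlights: (1) evaluating 14 methods whose backbones range from classical matrix factorization to the latest graph neural networks; (2) benchmarking method performance on real-world datasets with different types of injected outliers as well as naturally occurring outliers; (3) comparing the efficiency and scalability of the algorithms by runtime and GPU memory usage on synthetic graphs of different scales. Based on an analysis of extensive experimental results, we discuss the pros and cons of current methods and point out multiple crucial and promising directions for future research.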
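Patronizing and condescending language (PCL) has a large harmful impact and is difficult to detect, both for human judges and for existing NLP systems. For SemEval-2022 Task 4, we propose a novel Transformer-based model and its ensembles to accurately understand the linguistic context of such language for PCL detection. To facilitate comprehension of the subtle and subjective nature of PCL, two fine-tuning strategies are applied to capture discriminative features from diverse linguistic behaviour and categorical distributions. The system achieves remarkable results in the official ranking, including 1st and 5th places in the subtasks.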
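Document-level sentiment analysis (DSA) is especially challenging because of vague semantic links and complicated sentiment information. Recent work has been devoted to leveraging text summarization and has achieved promising results. However, these summarization-based methods do not take full advantage of the summary; in particular, they ignore the inherent interactions between the summary and the document. As a result, they limit the representation's ability to express the major points of the document, which are highly indicative of the key sentiment. In this paper, we study how to effectively generate a discriminative representation with explicit subject patterns and sentiment contexts. A Hierarchical Interaction Network (HIN) is proposed to explore bidirectional interactions between the summary and the document at multiple granularities and to learn subject-oriented document representations for sentiment classification. Furthermore, we design a Sentiment-based Rethinking mechanism (SR) that refines the HIN with sentiment label information to learn a more sentiment-aware document representation. We extensively evaluate our proposed models on three public datasets. The experimental results consistently demonstrate the effectiveness of the proposed models and show that HIN-SR outperforms various state-of-the-art methods.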
In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions covering various domains. We then address this problem by presenting formulaic knowledge, rather than by annotating additional data examples. More concretely, we construct a formulaic knowledge bank as a domain knowledge base and propose a framework (ReGrouP) to leverage this formulaic knowledge during parsing. Experiments using ReGrouP demonstrate a significant 28.2% improvement overall on KnowSQL.
Surgical robot automation has attracted increasing research interest over the past decade, given its huge potential to benefit surgeons, nurses and patients. Recently, the learning paradigm of embodied AI has demonstrated a promising ability to learn good control policies for various complex tasks, and embodied AI simulators play an essential role in facilitating the relevant research. However, existing open-source simulators for surgical robots still do not sufficiently support human interaction through physical input devices, which limits effective investigation of how human demonstrations affect policy learning. In this paper, we study human-in-the-loop embodied intelligence with a new interactive simulation platform for surgical robot learning. Specifically, we build our platform on our previously released SurRoL simulator, with several new features co-developed to allow high-quality human interaction via an input device. With these, we further propose to collect human demonstrations and imitate their action patterns to achieve more effective policy learning. We showcase the improvement of our simulation environment with the newly designed features and tasks, and validate state-of-the-art reinforcement learning algorithms using the interactive environment. Promising results are obtained, with which we hope to pave the way for future research on surgical embodied intelligence. Our platform is released and will be continuously updated on the website: https://med-air.github.io/SurRoL/
Brain midline shift (MLS) is one of the most critical factors to consider in clinical diagnosis and treatment decision-making for intracranial hemorrhage. Existing computational methods for MLS quantification not only require intensive labeling of millimeter-level measurements but also suffer from poor performance because they depend on specific landmarks or simplified anatomical assumptions. In this paper, we propose a novel semi-supervised framework to accurately measure the scale of MLS from head CT scans. We formulate the MLS measurement task as a deformation estimation problem and solve it using a few MLS slices with sparse labels. Meanwhile, with the help of diffusion models, we are able to use a large amount of unlabeled MLS data and 2793 non-MLS cases for representation learning and regularization. The extracted representation reflects how the image differs from a non-MLS image, and regularization plays an important role in the sparse-to-dense refinement of the deformation field. In our experiment on a real clinical brain hemorrhage dataset, the framework achieves state-of-the-art performance and generates interpretable deformation fields.
Text-to-SQL semantic parsing is an important NLP task that greatly facilitates interaction between users and databases and has become a key component of many human-computer interaction systems. Much recent progress in text-to-SQL has been driven by large-scale datasets, but most of them are centered on English. In this work, we present MultiSpider, the largest multilingual text-to-SQL dataset, which covers seven languages (English, German, French, Spanish, Japanese, Chinese, and Vietnamese). Building on MultiSpider, we further identify the lexical and structural challenges of text-to-SQL (caused by specific language properties and dialect sayings) and their intensity across different languages. Experimental results under three typical settings (zero-shot, monolingual and multilingual) reveal a 6.1% absolute drop in accuracy in non-English languages. Qualitative and quantitative analyses are conducted to understand the reasons for the performance drop in each language. Besides the dataset, we also propose a simple schema augmentation framework, SAVe (Schema-Augmentation-with-Verification), which significantly boosts the overall performance by about 1.8% and closes the 29.5% performance gap across languages.
Table-and-text hybrid question answering (HybridQA) is a widely used and challenging NLP task commonly applied in the financial and scientific domains. Early research focused on migrating methods from other QA tasks to HybridQA, while more recent work has presented a growing number of HybridQA-specific methods. Despite the rapid development of HybridQA, a systematic survey that summarizes the main techniques and advances further research is still lacking. We therefore present this work to summarize the current HybridQA benchmarks and methods, and then analyze the challenges and future directions of the task. The contributions of this paper are threefold: (1) the first survey, to the best of our knowledge, covering benchmarks, methods and challenges for HybridQA; (2) a systematic investigation with a reasonable comparison of existing systems to articulate their advantages and shortcomings; (3) a detailed analysis of challenges along four important dimensions to shed light on future directions.